ABSTRACT

This study empirically investigated bootstrap bias estimation in the area of structural 
equation modeling. Three correctly specified SEM models were used under four different sample 
size conditions. Monte Carlo experiments were carried out to generate the criteria against which 
bootstrap bias estimation could be judged. For SEM fit indices, bias estimates from the bootstrap
and Monte Carlo experiments were quite comparable for most indices. It is noted that bias was
constrained in one direction in the Monte Carlo experiments because of the perfect fit of the true 
SEM models. For the SEM loadings and coefficients, the difference between bootstrap and 
Monte Carlo bias estimations was very small, and the distributions of the bias estimators from the 
two experiments were quite similar. For the SEM variances/covariances, the comparison of the 
bias estimator distributions from the two experiments indicated that bootstrap bias estimation 
could be considered adequate. Because the study involved three SEM models which served as an
internal replication mechanism, the likelihood of chance discovery for the findings was small, and
the findings should have reasonable generalizability. Future studies may extend the current 
findings by examining misspecified SEM models. Data non-normality may be another dimension 
to be considered in future investigation.

Structural equation modeling (SEM) has increasingly been seen as a useful quantitative
technique for specifying, estimating, and testing hypothesized models describing relationships 
among a set of substantively meaningful variables. Much of SEM's attractiveness is due to the 
method's applicability in a wide variety of research situations, a versatility that has been amply 
demonstrated (e.g., Bollen & Long, 1993; Byrne, 1994; Joreskog & Sorbom, 1989; Loehlin,
1992; SAS Institute, 1990).

Furthermore, many widely used statistical techniques may also be considered as special 
cases of SEM, including regression analysis, canonical correlation analysis, confirmatory factor 
analysis, and path analysis (Bagozzi, Fornell & Larcker, 1981; Bentler, 1992; Fan, 1996; Joreskog 
& Sorbom, 1989). Because of such generality, SEM has been heralded as a unified model which 
joins methods from econometrics, psychometrics, sociometrics, and multivariate statistics 
(Bentler, 1994). In short, for researchers in the social and behavioral sciences, SEM has become 
an important tool for testing theories with both experimental and non-experimental data (Bentler 
& Dudgeon, 1996). 

The bootstrap method, on the other hand, has also been applauded as one of the newest
breakthroughs in statistics (Kotz & Johnson, 1992). The significance of bootstrapping as a
versatile nonparametric statistical approach to data analysis has been widely recognized not only 
by those in the area of statistics, but also by quantitative researchers in social and behavioral 
sciences. In the area of education, the recognition for the bootstrapping method was evidenced 
by the invited keynote address delivered by the pioneer in bootstrapping, Bradley Efron, at the 
1995 AERA Annual Meeting (Efron, 1995). 

In the social and behavioral sciences, the bootstrap method has been used in a variety of research
situations, and for many different statistical techniques. For example, the bootstrap method has been
applied in psychological measurement for such issues as differential test predictive validity (e.g.,
Fan & Mathews, 1994) and item bias (e.g., Harris & Kolen, 1991), and in sociological research
(e.g., Stine, 1989). The application of the bootstrap method has involved many different
statistical techniques, including correlation analysis (e.g., Rasmussen, 1987), regression analysis
(e.g., Fan & Jacoby, 1995), discriminant analysis (e.g., Dalgleish, 1994), canonical correlation
analysis (e.g., Fan & Wang, 1996), factor analysis (e.g., Lambert, Wildt, & Durand, 199,
Thompson, 1988), and structural equation modeling (e.g., Bollen & Stine, 1990). In addition to
its original use as a nonparametric alternative to statistical significance testing (Efron, 1985),
the bootstrap technique has also been advocated as a method of internal replication to assess the
replicability of results of an individual study (Thompson, 1993). 

As a general nonparametric technique, the application of bootstrap may be most 
appropriate in situations where the statistical theory is weak, or where theoretical assumptions are 
unlikely to be tenable. Structural equation modeling can be considered as an area in which the 
statistical theory is relatively weak (e.g., no theoretical sampling distributions for many model fit 
indices), and in which theoretical assumptions are often untenable (e.g., multivariate normality is 
often violated). Viewed from this perspective, the application of the bootstrap method in SEM can be
considered quite relevant and desirable. 

It has been observed that the applications of the bootstrap in SEM can be grouped under
four different purposes: (1) to estimate the bias of a sample statistic (bias estimation), (2) to estimate
standard errors, (3) to construct confidence intervals, and (4) to test SEM models (Yung &
Bentler, 1996). Despite its conceptual and procedural simplicity, and its versatility in a variety of
statistical situations, the success of applying the bootstrap method as originally proposed (Efron,
1979) in SEM may not be guaranteed. For example, it has been shown that bootstrap application
(as originally defined) has failed in the area of SEM model testing (Bollen & Stine, 1993).

Because of the uncertainty about its success in SEM, Yung and Bentler (1996) made a general 
suggestion that, in the area of SEM, bootstrap results should not be blindly and indiscriminately
trusted; instead, a particular application of bootstrapping (e.g., bias estimation) in SEM needs to 
be investigated until the validity of such application has been reasonably established. 

Of the four major applications of bootstrap in SEM, bias estimation (i.e., to use bootstrap- 
estimated bias of a statistic to represent the true bias of that statistic) has been identified as an 
area where little empirical evidence exists (Yung & Bentler, 1996). Consequently, the validity of 
bootstrap-based bias estimation in SEM is largely unknown. The problem here is that no studies 
have been reported which investigated bootstrap bias estimation for SEM. The most relevant 
studies which investigated bootstrap-based bias estimation actually applied bootstrap in 
exploratory factor analysis models, but not SEM models per se.

Chatterjee (1984), based on the empirical results from exploratory factor analysis, 
concluded that the bootstrap-estimated bias was very small for loadings in exploratory factor 
analysis. The results of the study, however, may not be very meaningful, for the reason that there 
was no external criterion against which the validity of the bootstrap-based bias estimation could 
be judged; instead, the researcher relied entirely on his “faith” in the bootstrap principle (Yung & 
Bentler, 1996). The study by Ichikawa and Konishi (1995) provided the needed external criteria
(“true” bias estimation empirically obtained through Monte Carlo simulation) for judging the
adequacy of bootstrap-based bias estimation. Their results indicated that bootstrap-based bias
estimation worked well for rotated factors, but less well for unrotated factors. Although this
study represents substantial improvement over that of Chatterjee (1984), the fact that both studies
investigated exploratory factor analysis, not confirmatory factor analysis or full SEM models,
considerably limited the generalizability of their results for SEM. As discussed by Yung and
Bentler (1996), “... bootstrap estimator for biases may not work equally well in all cases .... It
would seem risky indeed to have blind faith in the bootstrap principle in the hope that bootstrap
estimation of biases in the context of structural equation modeling will work out correctly” (p.
201).

In the most general form, the bias of an estimator $\hat{\theta}$ for $\theta$ is expressed as

$$B(\hat{\theta}) = E(\hat{\theta}_{\Delta}) \qquad (1)$$

where $\theta$ is the population parameter of any sort, $\hat{\theta}$ is the sample statistic for $\theta$, $\hat{\theta}_{\Delta}$ is the
deviation of the sample statistic from the population parameter $\theta$ (i.e., $\hat{\theta}_{\Delta} = \hat{\theta} - \theta$), and $E(\hat{\theta}_{\Delta})$ is the
expected value of $\hat{\theta}_{\Delta}$. In an application, the true bias of the statistic $B(\hat{\theta})$ is usually estimated
through Monte Carlo simulation. When N samples are drawn from the specified population, we
have each individual sample statistic deviation $\hat{\theta}_{\Delta i}$:

$$\bar{\theta}_{\Delta} = \frac{\sum_{i=1}^{N} \hat{\theta}_{\Delta i}}{N} \qquad (2)$$

The bias of the statistic is estimated through:

$$B(\hat{\theta}) = \bar{\theta}_{\Delta} \qquad (i = 1, 2, 3, \ldots, N) \qquad (3)$$

In other words, bias estimation is the averaged value of $\hat{\theta}_{\Delta i}$ obtained from repeated Monte Carlo
sampling from the defined population.

In bootstrap, the bias of a bootstrapped sample statistic is defined as the difference ($\hat{\theta}^{*}_{\Delta j}$)
between the parent sample statistic ($\hat{\theta}$) and that obtained from a bootstrapped sample ($\hat{\theta}^{*}_{j}$) (i.e.,
$\hat{\theta}^{*}_{\Delta j} = \hat{\theta}^{*}_{j} - \hat{\theta}$). When B samples are bootstrapped from the parent sample through sampling
with replacement, the bootstrap bias estimation is obtained by:

$$B^{*}(\hat{\theta}) = \bar{\theta}^{*}_{\Delta} = \frac{1}{B} \sum_{j=1}^{B} \hat{\theta}^{*}_{\Delta j} \qquad (j = 1, 2, 3, \ldots, B) \qquad (4)$$

Obviously, the bootstrap principle implies that $\bar{\theta}_{\Delta}$ in (3) is comparable to the $\bar{\theta}^{*}_{\Delta}$ in (4).
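The two bias estimators in Equations 2 through 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statistic is the divisor-n variance of a normal sample (chosen because its true bias, -sigma^2/n, is known), not an SEM statistic, and the function names, the sample size of 20, the variance of 4, and the 2,000 replications are all hypothetical choices that do not come from the study.

```python
import random
import statistics


def ml_variance(sample):
    """Variance with divisor n: a statistic with known bias -sigma^2 / n."""
    return statistics.pvariance(sample)


def monte_carlo_bias(statistic, draw_sample, theta, n_reps, rng):
    """Equations 2 and 3: mean deviation of the statistic from the known parameter."""
    deviations = [statistic(draw_sample(rng)) - theta for _ in range(n_reps)]
    return statistics.fmean(deviations), deviations


def bootstrap_bias(statistic, parent, n_boot, rng):
    """Equation 4: mean deviation of resampled statistics from the parent statistic."""
    theta_hat = statistic(parent)
    deviations = [
        statistic(rng.choices(parent, k=len(parent))) - theta_hat
        for _ in range(n_boot)
    ]
    return statistics.fmean(deviations), deviations


N_OBS, SIGMA2 = 20, 4.0  # illustrative values, not taken from the study


def draw_normal_sample(rng):
    """One sample of N_OBS observations from the defined population N(0, SIGMA2)."""
    return [rng.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N_OBS)]


rng = random.Random(2024)
mc_bias, mc_deviations = monte_carlo_bias(
    ml_variance, draw_normal_sample, SIGMA2, 2000, rng
)
parent_sample = draw_normal_sample(rng)  # the single bootstrap parent sample
bs_bias, bs_deviations = bootstrap_bias(ml_variance, parent_sample, 2000, rng)

print(f"Monte Carlo bias estimate: {mc_bias:.3f} (theory: {-SIGMA2 / N_OBS:.3f})")
print(f"Bootstrap bias estimate:   {bs_bias:.3f}")
```

The Monte Carlo version needs the population parameter, whereas the bootstrap version uses only the parent sample; this is why the former can serve as the criterion for the latter.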

In addition to the issue of whether or not bootstrap-based bias estimation (Equation 4) is 
systematically comparable with the estimated true bias of a statistic as defined in Equation 3, 
another relevant issue is about the distribution of bootstrap-based bias estimator. It is possible 
that bootstrap-based bias estimation may be comparable with the true bias of a statistic, but the 
variability of the bootstrap-based bias estimator ($\hat{\theta}^{*}_{\Delta j}$) is substantially different from that of the
bias estimator ($\hat{\theta}_{\Delta i}$) from Monte Carlo experiments. In other words, comparability between the
bootstrap-based bias estimation and the Monte Carlo bias estimation does not necessarily mean 
the comparability of distributions of the bias estimators from the two situations. 
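The distinction drawn here, equal average bias but unequal spread, can be made concrete with a small sketch. The two lists of deviations below are synthetic numbers generated only for illustration (same mean, different standard deviation); they are not results from this study, and the `describe` helper is a hypothetical name.

```python
import random
import statistics


def describe(deviations):
    """Location, spread, and quartiles of a list of bias deviations."""
    q1, median, q3 = statistics.quantiles(deviations, n=4)
    return {
        "mean": statistics.fmean(deviations),
        "sd": statistics.stdev(deviations),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


# Synthetic deviations: both centered near -0.02, but with different variability.
rng = random.Random(7)
monte_carlo_dev = [rng.gauss(-0.02, 0.10) for _ in range(200)]
bootstrap_dev = [rng.gauss(-0.02, 0.25) for _ in range(200)]

mc_summary = describe(monte_carlo_dev)
bs_summary = describe(bootstrap_dev)
for label, summary in (("Monte Carlo", mc_summary), ("Bootstrap", bs_summary)):
    print(label, {key: round(value, 3) for key, value in summary.items()})
```

Comparing only the two means would suggest agreement, while the standard deviations and quartiles reveal that the two distributions differ.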

The present study was designed to provide empirical information with regard to the 
adequacy of bootstrap-based bias estimation in structural equation modeling. The study was 
concerned with two research questions: 

(1) How comparable is the bootstrap-based bias estimation with the “true” bias (as estimated 
from Monte Carlo simulation)? 

(2) How comparable is the distribution of the bootstrap bias estimator with the distribution of
the bias estimator from Monte Carlo experiments?

Both research questions were investigated with regard to:

(a) SEM fit indices, 

(b) factor loadings and path coefficients in SEM models, 

(c) variances and covariances (for both observed and latent variables) in SEM models. 

METHODS 

Several issues were considered in the design of this study: the generalizability of the
results of the study, the issue of sample size, and the criterion for judging the adequacy of the
bootstrap bias estimation. Attempts were made to accommodate these relevant issues in the study
design.

Increasing the Generalizability of Findings

Two features were incorporated in the study design for the purpose of increasing the 
generalizability of the study results: multiple SEM models, and realistic SEM models and model 
parameters. Because it is always uncertain to what extent the results from a particular SEM
model would capture the complexity of SEM modeling in general, it is possible that the findings
based on a particular model may reflect some idiosyncrasies associated with the model, and
consequently, may only have limited generalizability. In this study, to minimize the possibility that
the idiosyncrasies of a particular model may be mistaken for something more generalizable, three
different SEM models (one confirmatory factor analysis model and two full structural models)
with varying degrees of model complexity were simulated. Such a design of multiple models
provided a very useful mechanism for verifying the internal validity of the findings. As is generally
recognized, internal validity is a necessary, although not sufficient, condition of external validity
(i.e., the generalizability of the findings) (Gall, Borg, & Gall, 1996).

In addition to the multiple SEM models, this study adopted realistic SEM models which 
were based on substantive research studies. As suggested by Gerbing and Anderson (1993), 
simulating substantively meaningful models might increase the external validity of a study. Not
only were all three SEM models simulated in this study drawn from substantive research studies, but
the model parameters simulated also closely matched those in the substantive studies. Of the
three SEM models simulated in this study, one is a confirmatory factor analysis model that originally
appeared in the research report by Calsyn and Kenny (1977), and was later discussed as a
confirmatory factor analysis model example by Joreskog and Sorbom (1989, p. 83). The second
SEM model is a full structural equation model discussed by Joreskog and Sorbom (1989, p. 178) 
which was based on some longitudinal data from a study conducted by the Educational Testing 
Service. The third model is also a full structural equation model which originally appeared in the 
work by Wheaton, Muthen, Alwin, and Summers (1977), and which was later widely discussed in 
the SEM literature (e.g., Bentler, 1992; Joreskog & Sorbom, 1989; SAS Institute, 1990). To 
limit the scope of the study, only statistically true models were simulated, although, ideally, models
with model specification error should also be studied in the future. Figure 1, Figure 2, and Figure 
3 present the three SEM models used in this study, and their respective model parameters in 
LISREL terms. 

Insert Figures 1, 2, and 3 about here



Sample Size 

It is unclear how large a sample should be in SEM applications. The research findings on
this issue are inconclusive (MacCallum, Roznowski, & Necowitz, 1992; Tanaka, 1987). It has
been reported that small sample size led not only to untrustworthy estimation results and fit
indices, but also to a high rate of improper solutions in simulation (Ichikawa & Konishi, 1995).
A sample size of 200 in SEM applications has been considered relatively small by many
(Boomsma, 1982; Camstra & Boomsma, 1992; Ichikawa & Konishi, 1995; MacCallum et al.,
1992). Some researchers even consider sample sizes in the thousands to be adequate (e.g., Hu, 
Bentler, & Kano, 1992; Marsh et al., 1988). Realistically, however, such large sample sizes are 
often beyond the reach of most researchers. Also, as pointed out by many researchers, it may not 
be realistic to suggest a single value to define a small or large sample, because models and the 
number of free parameters vary from application to application, so consideration of sample size 
should be related to model complexity and the number of free parameters (MacCallum et al., 
1992; Tanaka, 1987). 

In this study, four sample size conditions (n=100, 200, 500, 1000) were simulated. As 
discussed above, there are no objective criteria on this issue. Based on our experience and our 
perception about the current practice in this area, we felt that these sample size conditions were 
quite representative of the current practice, and the four sample size conditions were also 
considered to be reasonably spaced. 

“True” Bias as the Criteria for Judging Bootstrap Bias Estimation 

As discussed in the literature review section, in order to understand the adequacy of 
bootstrap bias estimation, external criteria must first be obtained against which bootstrap bias 
estimation can be compared. In this study, the external criteria (“true” bias) were obtained 
empirically through Monte Carlo simulation. The three SEM models, the four sample size
conditions, and 200 replications within each cell made up a total of 2400 (3 × 4 × 200 = 2400)
random samples, each fitted to one of the three SEM models. Based on the Monte Carlo
experiments, “true” bias for an SEM statistic ($\hat{\theta}$) was empirically obtained through Equations 2
and 3.

Within each cell condition described above, one sample was generated to serve as the
parent sample for bootstrapping, and 200 samples were bootstrapped from this parent sample
using the “completely nonparametric bootstrap” approach (Yung & Bentler, 1996). In this bootstrap
phase of the study, a total of 2400 samples were obtained through bootstrap resampling, and each was
fitted to one of the three SEM models. The bootstrap bias estimation for the statistic ($\hat{\theta}$) was
obtained through Equation 4. The Monte Carlo bias estimate $B(\hat{\theta})$ and the bootstrap bias
estimate $B^{*}(\hat{\theta})$ were empirically compared to check whether the bootstrap bias estimate $B^{*}(\hat{\theta})$ was
systematically different from the Monte Carlo bias estimate $B(\hat{\theta})$. In addition, the sampling
distribution of the bootstrap bias estimator ($\hat{\theta}^{*}_{\Delta j}$) was compared with that of the Monte Carlo bias
estimator ($\hat{\theta}_{\Delta i}$) to check how comparable the bias estimator distributions were.
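The sample counts in the two phases of the design can be checked by enumerating the cells. The sketch below only restates the design described above; the constant names are hypothetical, and the model labels follow those used in the figure descriptions later in the paper.

```python
from itertools import product

MODELS = ("CFA Model", "SEM Model 1", "SEM Model 2")
SAMPLE_SIZES = (100, 200, 500, 1000)
REPLICATIONS = 200  # per cell, in both the Monte Carlo and the bootstrap phase

cells = list(product(MODELS, SAMPLE_SIZES))

# Monte Carlo phase: 200 independent samples from the population in each cell.
monte_carlo_samples = len(cells) * REPLICATIONS

# Bootstrap phase: one parent sample per cell, resampled 200 times.
parent_samples = len(cells)
bootstrap_samples = parent_samples * REPLICATIONS

print(len(cells), monte_carlo_samples, parent_samples, bootstrap_samples)
```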

Data generation was accomplished by using the SAS normal data generator. Multivariate
normal data were simulated using the matrix decomposition procedure (Kaiser & Dickman,
1962). All data generation, bootstrap resampling, and model fitting (SAS PROC CALIS) were
accomplished through the SAS system (SAS Version 6.11).
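The general idea behind generating multivariate normal data by matrix decomposition is to multiply independent standard normal variates by a factor of the target covariance matrix. The sketch below is written in Python rather than SAS and substitutes a Cholesky factor for the decomposition; it is not the study's SAS code, the decomposition used there may differ, and the 3 × 3 target matrix is a made-up example rather than one of the study's model-implied matrices.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular L with L * L' equal to a positive definite matrix."""
    p = len(matrix)
    lower = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def draw_mvn(cov, n, rng):
    """Draw n rows from N(0, cov) by transforming independent N(0, 1) variates."""
    lower = cholesky(cov)
    p = len(cov)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        rows.append([sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(p)])
    return rows


def sample_cov(rows):
    """Unbiased sample covariance matrix of the generated rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in rows) / (n - 1)
            for b in range(p)
        ]
        for a in range(p)
    ]


# Hypothetical target covariance matrix, for illustration only.
TARGET = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
data = draw_mvn(TARGET, 5000, random.Random(1962))
estimated = sample_cov(data)
for row in estimated:
    print([round(value, 2) for value in row])
```

With a large generated sample, the printed covariance matrix should be close to the target, which is a quick check that the transformation reproduces the intended population structure.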

RESULTS AND DISCUSSIONS 

Because the amount of results which could be presented was huge, our presentation of the
results was selective, and we tried to focus on the main questions. Because all the simulation
results have been saved, some additional analyses can be conducted if there is enough interest.
Our presentation of the results and the related discussion followed the order of the research
questions presented earlier in the paper.

Comparability between Bootstrap Bias Estimation and Monte Carlo Bias Estimation 

The first research question asks, “How comparable is the bootstrap bias estimation with
the ‘true’ bias in SEM?” To answer this question, we presented the results separately for (1)
SEM fit indices; (2) loadings and path coefficients in SEM; and (3) variances and covariances in
SEM (for both observed and latent variables).

SEM fit indices. Table 1 presents the bootstrap bias estimation [$B^{*}(\hat{\theta})$] and the Monte
Carlo bias estimation [$B(\hat{\theta})$] for a variety of SEM fit indices. A negative sign indicates a
downward bias (the estimate smaller than the parameter), and a positive sign (no sign) indicates
an upward bias (the estimate larger than the parameter). A close look at the table reveals several
observations. First, both the bootstrap and the Monte Carlo experiments show downward bias for all the fit
indices except the $\chi^2$ and the RMR (root mean square residual), for which there is upward bias.
For Monte Carlo results, this is expected, because all the models are true models, and the fit for
the defined population is perfect. In other words, for true SEM models, the population parameter
values of these indices function either as the “ceiling” (all except $\chi^2$ and RMR) or the “floor” ($\chi^2$
and RMR) of the possible values of the statistics. As such, bias of the statistic is constrained to be
in one direction only: downward (for fit indices with a “ceiling”) or upward (for indices with a
“floor”).
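The “floor” argument can be illustrated with a deliberately small covariance structure model. The sketch assumes a hypothetical two-variable model stating that the variables are uncorrelated, which is true in the simulated population. For this model the maximum likelihood discrepancy reduces to $-\ln(1 - r^2)$, so the $\chi^2$ statistic is $(n - 1)$ times that quantity, with 1 degree of freedom and a population value of 0. The model, sample size, and replication count are illustrative assumptions, not the models of this study.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Sample correlation between two equally long lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def chi_square_independence(x, y):
    """ML chi-square for the model 'x and y are uncorrelated' (1 df).

    With the fitted matrix diag(s_xx, s_yy), the ML discrepancy
    ln|Sigma| + tr(S Sigma^-1) - ln|S| - p reduces to -ln(1 - r^2).
    """
    r = pearson_r(x, y)
    return (len(x) - 1) * -math.log(1.0 - r * r)


rng = random.Random(11)
n_obs, n_reps = 100, 200
population_value = 0.0  # perfect fit of the true model in the population

deviations = []
for _ in range(n_reps):
    x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    deviations.append(chi_square_independence(x, y) - population_value)

print(f"smallest deviation: {min(deviations):.4f}")
print(f"estimated bias:     {statistics.fmean(deviations):.3f}")
```

Because no sample can fit better than the population value of 0, every deviation is non-negative and the Monte Carlo bias estimate is necessarily upward.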

The fact that the bootstrap bias estimation is consistently in the same direction as the 
“true” bias (from the Monte Carlo results) for all the fit indices is somewhat remarkable. This is 
because, as Bollen and Stine (1993) discussed, the bootstrap resampling space (i.e., the bootstrap 
parent sample) does not represent the null model (i.e., the correctly specified true model). This 
fact means that, for these fit indices, the bootstrap parent sample statistic may not function as the
“ceiling” or “floor”, as the population parameters do in Monte Carlo simulation; and some
bootstrapped samples may have better fit index values than those of the parent sample. In spite of
the difference in this regard, bootstrap still produces bias estimation consistent in direction with 
the Monte Carlo estimated bias. 

The second observation from Table 1 is that, for most of the fit indices, bootstrap bias
estimation is also quite comparable in magnitude with the Monte Carlo estimated bias, with the
possible exception of RMR, CFI, CENTRA, NNFI, and DELTA2. The comparability in
magnitude for $\chi^2$, GFI, AGFI, NFI, RHO1, PGFI, and PARMS is quite remarkable. The other
five indices, however, showed less comparability between the two kinds of experiments. Of these 
five fit indices, bootstrap estimated bias is consistently smaller in magnitude than the bias from the 
Monte Carlo experiments for RMR, but bootstrap estimated bias tends to be larger than the 
Monte Carlo bias for the other four. 

In general, the bootstrap method performed quite well in bias estimation for most of the 
SEM fit indices examined above when compared with the bias estimated from Monte Carlo 
experiments. As expected, the magnitude of bias decreases with the increase of sample size for 
both the bootstrap estimated bias and the “true” bias. It is important to note that the findings 
discussed above are evident from the three SEM models with different data characteristics, so it is 
unlikely that these observations are the result of the idiosyncrasies of any particular model or
the data structure associated with it. 

Loadings and path coefficients in SEM. Table 2 presents the comparison of bootstrap
bias estimation with the “true” bias for the factor loadings and coefficients of the three SEM
models ($\Lambda_X$, $\Lambda_Y$, $\Gamma$, and $B$ matrices in LISREL terms). Because the three models have different
loadings and coefficients, Table 2 actually consists of three sub-tables, one for each SEM model.
It should be noted that these are bias estimates for the unstandardized coefficients and loadings,
not for the standardized coefficients and loadings. The specified values of these original loadings
and coefficients, however, range approximately from 0.2 to 1.2, similar to standardized
coefficients in terms of the scale range. Several observations can be made about Table 2. First, 
the absolute magnitude of the estimated bias (both from the bootstrap and the Monte Carlo 
experiments) for these statistics is very small for the overwhelming majority of them. Typically, 
the bias only appears at the third or fourth decimal place for these loadings and coefficients. Only 
a few entries show bias at the second decimal place, and one or two entries show bias at the first 
decimal place. The small bias in the majority of the cases suggests that, for practical research 
applications, bias of these SEM statistics is minimal, and may safely be ignored without much loss 
of information. 

Relatively speaking, however, contrary to the situation for SEM fit indices, where a high
degree of comparability between the bootstrap and Monte Carlo bias estimates was observed for 
most of the fit indices, there does not appear to be a consistent pattern between the bootstrap and 
Monte Carlo bias estimates for these loadings and coefficients, either in terms of bias direction, or 
in terms of bias magnitude. In some cases, bootstrap bias is larger than the Monte Carlo bias, 
while the reverse is true in others. The same can also be said about the direction of bias in the 
two situations. It may be argued that the minuscule amount of bias in both situations may have 
made the relative difference between the two (bootstrap and Monte Carlo bias estimates) 
unimportant for practical purposes. Again, as expected, the magnitude of bias decreases with the 
increase of the sample size. 

Variances and covariances in SEM. Table 3 presents the bias estimation for the variances
and covariances in SEM ($\Theta_\delta$, $\Theta_\epsilon$, $\Phi$, and $\Psi$ matrices in LISREL terms). Like Table 2, Table 3
actually consists of three sub-tables, one for each SEM model. It should be noted that these are
estimated biases for the variances and covariances in SEM models, and the specified parameter
values of these variances and covariances may be quite large (ranging from 1.6 to 115; see Figures
1, 2, and 3 for model specifications). Because the original values may be quite large, we need to
be careful in interpreting the magnitude of these bias estimates. For example, for the coefficient
$\phi_{11}$ in SEM Model 1, the bootstrap estimated bias and the Monte Carlo estimated bias for sample
size condition N=100 are each approximately 2.5 in magnitude (though in different directions). This value of bias may
seem large if it is read on the measurement scale of the loadings in Table 2. But the specified
value for $\phi_{11}$ was 105, and percentagewise, the bias is only about 2.3% of the parameter value,
and comparable to a bias of 0.012 for a correlation coefficient of 0.5.
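The scale comparison in this example amounts to expressing each bias as a proportion of its parameter value. A minimal check of the arithmetic follows; the `relative_bias` helper is a hypothetical name, and the input 2.5 is the rounded magnitude quoted above, so the first ratio comes out near 2.4% rather than exactly the 2.3% reported (which presumably reflects the unrounded estimate).

```python
def relative_bias(bias, parameter):
    """Bias expressed as a proportion of the parameter value."""
    return bias / parameter


variance_ratio = relative_bias(2.5, 105)      # variance-scale example
correlation_ratio = relative_bias(0.012, 0.5)  # correlation-scale example

print(f"{variance_ratio:.2%} vs. {correlation_ratio:.2%}")
```

On a relative scale the two biases are of essentially the same size, even though their raw magnitudes differ by a factor of about 200.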

As in Table 2, there is a lack of a consistent pattern between bias estimates from bootstrap
and Monte Carlo experiments, both in terms of bias direction (upward or downward) and in terms 
of bias magnitude. Although, in many cases, bootstrap estimated bias appears to be larger than the 
Monte Carlo estimated bias in terms of magnitude, the opposite is also abundant in the same table. 
The increase of the sample size, as in Table 1 and Table 2, has the expected tendency of reducing 
the magnitude of bias for both bootstrap and Monte Carlo estimated bias. 

Another observation about the bootstrap and Monte Carlo experiments is related to the 
issue of improper solutions. In this study, samples (either in bootstrap or Monte Carlo 
experiments) with improper solutions were excluded from further analyses, and they were not 
replaced with new samples. It is observed that the occurrence of improper solutions is higher for 
bootstrap experiments than for Monte Carlo experiments. The occurrences of improper solutions 
under different sample size conditions and for the different SEM models are presented in Table 4. 
It is seen that improper solutions mainly occurred for N=100 and N=200 conditions, especially for
SEM Model 2.
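The exclusion rule described above (drop improper solutions, do not replace them) can be sketched as follows. The paper does not spell out its criterion for an improper solution, so the sketch assumes a common working definition: a replication is improper if the estimation did not converge or if any variance estimate is negative. The `Replication` record, its fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Replication:
    estimate: float             # estimate of the parameter of interest
    variance_estimates: tuple   # all variance estimates in the fitted model
    converged: bool = True


def is_improper(rep):
    """Assumed criterion: non-convergence or any negative variance estimate."""
    return (not rep.converged) or any(v < 0 for v in rep.variance_estimates)


def bias_excluding_improper(reps, reference):
    """Mean deviation from the reference value over proper solutions only.

    Improper replications are dropped and not replaced, so the number of
    usable replications can fall below the number requested.
    """
    kept = [rep for rep in reps if not is_improper(rep)]
    bias = fmean([rep.estimate - reference for rep in kept])
    return bias, len(reps) - len(kept)


# Made-up replications for illustration.
replications = [
    Replication(0.52, (1.1, 0.9)),
    Replication(0.47, (1.0, 0.8)),
    Replication(0.90, (1.2, -0.1)),                  # negative variance estimate
    Replication(0.50, (1.0, 1.0), converged=False),  # failed to converge
]
bias, n_excluded = bias_excluding_improper(replications, reference=0.5)
print(f"bias = {bias:.4f}, excluded = {n_excluded}")
```

In the Monte Carlo phase the reference value is the population parameter; in the bootstrap phase it is the parent sample statistic.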



Insert Table 4 about here 



Comparability of Bias Estimator Distributions: Bootstrap ($\hat{\theta}^{*}_{\Delta j}$) vs. Monte Carlo ($\hat{\theta}_{\Delta i}$)

The second research question asks, “how comparable is the distribution of the bootstrap 
bias estimator and that of the Monte Carlo bias estimator?” Because so many different kinds of 
statistics are involved for different models and under different sample size conditions, it is difficult 
to describe the distributional characteristics of all the estimators in a concise and easy-to- 
understand manner. Instead of trying to be complete, we graphically presented a limited number 
of cases to illustrate our key observations with regard to the distributional characteristics of the 
bootstrap bias estimators vs. the Monte Carlo bias estimators. 

SEM fit indices. Figure 4 presents the comparisons of bootstrap bias estimator
distributions with those of Monte Carlo bias estimators for four SEM fit indices [GFI (SEM
Model 2, N=100), NFI (CFA Model, N=200), $\chi^2$ (CFA Model, N=200), and RMR (SEM Model
1, N=100)]. In this and in the following Figures 5 and 6, black bars represent the distributions of
the bootstrap bias estimator, while the shaded bars represent the distributions of the Monte Carlo bias
estimator. As discussed previously, because the true SEM models were simulated in the Monte
Carlo simulation, these fit indices have either a “ceiling” (GFI and NFI) or a “floor” ($\chi^2$ and
RMR). As a result, the bias from the Monte Carlo experiments could only be either all negative
(downward bias, for GFI and NFI) or all positive (upward bias, for $\chi^2$ and RMR). On the other
hand, for the bootstrap resampling space (i.e., bootstrap parent sample), the models were not true
SEM models (see Bollen & Stine, 1993, for the discussion related to the issue). As a result, the
bootstrap bias theoretically could be either positive or negative for all these fit indices. This
difference underlies the disparity between the distributions of the bootstrap bias estimator and the
Monte Carlo bias estimator. The distributional difference between the bootstrap and Monte Carlo bias
estimators is very similar for GFI, NFI, and $\chi^2$, although $\chi^2$ is opposite in direction.

Insert Figure 4 about here 

In Table 1 presented earlier, bootstrap bias estimation for RMR is consistently smaller 
than the “true” bias. Figure 4 (d) reveals the distributional differences of the bootstrap bias 
estimator and the Monte Carlo bias estimator. Contrary to Figure 4 (a), (b), and (c) where the 
overlap of the two bias estimator distributions was substantial, for RMR, the two distributions are 
quite different in terms of both the location and the shape of the distributions, with the location of 
the bootstrap bias estimator being much closer to zero. This explains why bootstrap bias 
estimation was much smaller than that based on Monte Carlo simulation as presented in Table 1. 

The disparity between the bootstrap bias estimator distribution and that of the Monte 
Carlo bias estimator, however, may partly be the consequence of simulating true SEM models 
with their inherent “floor” or “ceiling” for the fit indices, which constrains the bias to be only in 
one direction for Monte Carlo experiments. But the bootstrap bias estimation is less constrained 
in this regard, because the bootstrap resampling space represents a misspecified SEM model, 
rather than a true SEM model as in the Monte Carlo simulation. It is likely that the disparity 
between the bootstrap bias estimator distribution and that of the Monte Carlo bias estimator 
would be smaller than what has been observed in Figure 4 if misspecified SEM models were used in
the study. This possibility, of course, needs to be empirically examined. 

SEM loadings and coefficients. Figure 5 presents the distributional comparisons between
the bootstrap bias estimator and the Monte Carlo bias estimator for four SEM loadings and
coefficients [a $\Lambda_X$ loading (CFA Model, N=200), $\lambda_{Y32}$ (SEM Model 1, N=200), $\gamma_{11}$ (SEM Model 1,
N=500), and a $B$ coefficient (SEM Model 2, N=100)]. It is seen that, for all the four coefficients, the
distributions of the bootstrap bias estimator and the Monte Carlo bias estimator are quite 
comparable in terms of their range and shape, indicating that the bootstrap bias estimator is close 
to the Monte Carlo bias estimator. Notice that, for these loadings and coefficients, there do not 
exist any constraints caused by a theoretical ceiling or floor, as is the case for the fit indices. As a
result, the bias estimator can be distributed in both directions (positive and negative) for both the
bootstrap and the Monte Carlo experiments. The similar distributions of the bootstrap and the 
Monte Carlo bias estimators support the earlier discussion that the minuscule difference between 
the two bias estimations observed in Table 2 is most probably inconsequential. 



Insert Figure 5 about here 



SEM variances and covariances. Figure 6 presents the distributional comparisons 
between the bootstrap bias estimator and the Monte Carlo bias estimator for four SEM variances 
and covariances [δ_33 (CFA Model, N=200), ε_44 (SEM Model 1, N=100), φ_21 (SEM Model 1, 
N=500), and ε_13 (SEM Model 2, N=200)]. Similar to Figure 5, the distributions of the bootstrap 
bias estimator and the Monte Carlo bias estimator are quite comparable in terms of their range 
and shape, indicating that the bootstrap bias estimation is adequate. Unlike the SEM fit indices 
discussed previously, for these (co)variances, there is no theoretical “ceiling” or “floor” that 
constrains the bias in a certain direction for Monte Carlo bias estimation. The similar distributions 
of the bias estimators from the bootstrap and the Monte Carlo experiments suggest that the 
difference between the bootstrap and the Monte Carlo bias estimations observed in Table 3 can be 
considered small and inconsequential. As pointed out before, it is somewhat difficult to judge the 
magnitude of bias for these variances and covariances because of their measurement scales. 



Insert Figure 6 about here 



SUMMARY AND CONCLUSIONS 

This study examined the application of the bootstrap method for bias estimation in the area of 
SEM. Three correctly specified SEM models were used under four different sample size 
conditions. Monte Carlo experiments were carried out to generate the criteria against which 
bootstrap bias estimation could be judged. For the Monte Carlo experiments, 200 replications 
were simulated within each cell condition, and the “true” bias of a statistic was obtained. For the 
bootstrap experiments, one parent bootstrap sample was generated under each cell condition, 
and it served as the resampling space for bootstrapping within the cell. Two hundred bootstrap 
samples were obtained from this parent sample through sampling with replacement. The bootstrapped 
bias of a statistic was obtained and then compared with the bias of the statistic obtained from the 
Monte Carlo experiments. Both the amount and direction of bias, and the distributions of the bias 
estimator, from the bootstrap and Monte Carlo experiments were compared. 
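
The two bias estimates contrasted in this design can be sketched in a few lines. The code below is a minimal illustration, not the procedure used in the study: it replaces the SEM statistics with the divisor-n sample variance of a standard normal variable, a statistic whose analytic bias is known to be -σ²/n, so that both estimates can be checked against a known value. The function names (var_n, mc_bias, bootstrap_bias), the sample size of 20, and the seed are assumptions made for the illustration. The 200 replications mirror the count reported above.

```python
import random


def var_n(xs):
    """Sample variance with divisor n (biased downward by sigma^2 / n)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def mc_bias(stat, true_value, n, reps, rng):
    """Monte Carlo bias: mean of the statistic over fresh samples drawn
    from the population, minus the known population value."""
    estimates = [stat([rng.gauss(0.0, 1.0) for _ in range(n)])
                 for _ in range(reps)]
    return sum(estimates) / reps - true_value


def bootstrap_bias(stat, parent, reps, rng):
    """Bootstrap bias: mean of the statistic over resamples drawn with
    replacement from one parent sample, minus the parent-sample value."""
    n = len(parent)
    estimates = [stat(rng.choices(parent, k=n)) for _ in range(reps)]
    return sum(estimates) / reps - stat(parent)


rng = random.Random(1998)   # arbitrary seed, chosen for reproducibility
n, reps = 20, 200           # n is illustrative; 200 follows the study
parent = [rng.gauss(0.0, 1.0) for _ in range(n)]

print("analytic bias   :", -1.0 / n)
print("Monte Carlo bias:", mc_bias(var_n, 1.0, n, reps, rng))
print("bootstrap bias  :", bootstrap_bias(var_n, parent, reps, rng))
```

The key difference is the reference value: the Monte Carlo estimate is centered on the known population parameter, whereas the bootstrap estimate is centered on the statistic computed from the parent sample. With only 200 replications both estimates are noisy, so individual runs may differ noticeably from the analytic value.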

For SEM fit indices, bias estimation from the bootstrap and Monte Carlo experiments was 
quite comparable for most of them in terms of both direction and magnitude. It is noted that, 
because true SEM models were simulated, bias was constrained in one direction in the Monte 
Carlo experiments, but less so in the bootstrap experiments. Because of this, the bootstrap bias 
estimator had wider dispersion and was less skewed. It is hypothesized that if misspecified 
SEM models were used in the study, the disparity between the bias estimator distributions from 
these two kinds of experiments would be reduced, because in this situation, the less-than-perfect 
population fit indices would function less as a constraint for the bias estimation in Monte Carlo 
experiments. 

For the SEM loadings and coefficients, the difference between bias estimations from the 
bootstrap and the Monte Carlo experiments was very small, although there did not appear to be a 
consistent pattern. The distributions of the bias estimators from the two experiments were quite 
similar in terms of both shape and range, suggesting that the bootstrap bias estimation was 
adequate because its bias estimator was comparable to that from the Monte Carlo experiments. 

For the SEM variances/covariances, the picture was similar to that for the SEM loadings and 
coefficients. Because the measurement scales were quite different for these 
variances/covariances, caution was warranted in interpreting the magnitude of their estimated 
bias. The similar distributions of the bias estimators from the two experiments indicated that 
bootstrap bias estimation could be considered adequate. 

To our knowledge, this is the first reported empirical study that provides direct 
evidence with regard to the adequacy of bootstrap bias estimation in SEM. It is our 
hope that the findings from this study will make a meaningful contribution to the SEM literature. 
Because the study involved three SEM models, which served as an internal replication mechanism, we 
are reasonably confident that the likelihood of chance discovery for the findings was small, and 
the findings should be generalizable. Future studies may extend the current findings by examining 
SEM models with some degree of misspecification, as discussed in this paper. In addition, data 
non-normality may be another dimension that should be considered in future investigation. 

Table 4 Number of Samples with Improper Solutions 

                                  SEM Models
Sample N    Experiment       CFA     SEM1     SEM2
---------------------------------------------------
100         Bootstrap         15       17       41
            Monte Carlo        1        2       22
200         Bootstrap          0        0       29
            Monte Carlo        0        0        6
500         Bootstrap          0        0        3
            Monte Carlo        0        0        0
1000        Bootstrap          0        0        0
            Monte Carlo        0        0        0

Figure Captions 

Figure 1. Confirmatory Factor Analysis Model and Model Specifications 
Figure 2. Structural Equation Model 1 and Model Specifications 
Figure 3. Structural Equation Model 2 and Model Specifications 
Figure 4. Comparison of Bias Estimator Distributions for Four SEM Fit Indices (Black 
Bars: Bootstrap; Shaded Bars: Monte Carlo) 
Figure 5. Comparison of Bias Estimator Distributions for Four SEM Loadings and 
Coefficients (Black Bars: Bootstrap; Shaded Bars: Monte Carlo) 
Figure 6. Comparison of Bias Estimator Distributions for Four SEM Variances/Covariances 
(Black Bars: Bootstrap; Shaded Bars: Monte Carlo) 

[Figure 5. Panels: (a) λ loading (CFA, N=200); (b) λ_Y32 (SEM1, N=200); (c) γ_11 (SEM1, N=500); (d) β_21 (SEM2, N=100)] 

[Figure 6. Panels: (a) δ_33 (CFA, N=200); (b) ε_44 (SEM1, N=100); (c) φ_21 (SEM1, N=500); (d) ε_13 (SEM2, N=200)] 



DOCUMENT IDENTIFICATION 

Title: Bootstrap Estimation of Sample Statistic Bias in Structural Equation Modeling 
Author(s): Xitao Fan, Bruce Thompson 
Corporate Source: Utah State University 
Publication Date: April 13, 1998 







